Reward hacking
Paperclip maximizer – Hypothesis about intelligent agents
Outer alignment – Conformance of AI to intended objectives
Perverse incentive – Incentive with unintended results
https://en.wikipedia.org/wiki/Reward_hacking